post-hoc method
Appendixfor" Self-InterpretableModelwith TransformationEquivariant Interpretation "
Please refer to Appendix 5 for details. Besides, in order to balance the classification loss and the transformation loss, we set the scalar factor to λ = 5 throughout the training phase. Here the first rows are the untransformed and the transformed images, while the second rows are the corresponding interpretations; this is a supplement to Figure 1 in the main body of the paper. There are also perturbation-based methods such as randomized input sampling (RISE) [8] and extremal perturbation (EP) [2].
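As a rough illustration of how the two loss terms can be combined with the scalar factor λ = 5, here is a minimal sketch; the `model` returning a (logits, interpretation) pair and the `transform` argument are assumptions, not the paper's exact interfaces.

```python
import torch
import torch.nn.functional as F

def total_loss(model, transform, x, labels, lam=5.0):
    """Classification loss plus a transformation-equivariance term,
    weighted by a scalar factor lambda (5 here, matching the appendix).
    `model(x)` is assumed to return (logits, interpretation map)."""
    x_t = transform(x)                 # transformed input image
    logits, interp = model(x)          # prediction + interpretation for original
    _, interp_t = model(x_t)           # interpretation for transformed input
    cls_loss = F.cross_entropy(logits, labels)
    # Equivariance: interpreting the transformed image should match
    # transforming the interpretation of the original image.
    trans_loss = F.mse_loss(interp_t, transform(interp))
    return cls_loss + lam * trans_loss
```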
Quantifying Cross-Attention Interaction in Transformers for Interpreting TCR-pMHC Binding
Li, Jiarui, Yin, Zixiang, Smith, Haley, Ding, Zhengming, Landry, Samuel J., Mettu, Ramgopal R.
CD8+ "killer" T cells and CD4+ "helper" T cells play a central role in the adaptive immune system by recognizing antigens presented by Major Histocompatibility Complex (pMHC) molecules via T Cell Receptors (TCRs). Modeling binding between T cells and the pMHC complex is fundamental to understanding basic mechanisms of human immune response as well as in developing therapies. While transformer-based models such as TULIP have achieved impressive performance in this domain, their black-box nature precludes interpretability and thus limits a deeper mechanistic understanding of T cell response. Most existing post-hoc explainable AI (xAI) methods are confined to encoder-only, co-attention, or model-specific architectures and cannot handle encoder-decoder transformers used in TCR-pMHC modeling. To address this gap, we propose Quantifying Cross-Attention Interaction (QCAI), a new post-hoc method designed to interpret the cross-attention mechanisms in transformer decoders. Quantitative evaluation is a challenge for XAI methods; we have compiled TCR-XAI, a benchmark consisting of 274 experimentally determined TCR-pMHC structures to serve as ground truth for binding. Using these structures we compute physical distances between relevant amino acid residues in the TCR-pMHC interaction region and evaluate how well our method and others estimate the importance of residues in this region across the dataset. We show that QCAI achieves state-of-the-art performance on both interpretability and prediction accuracy under the TCR-XAI benchmark. T cells play a pivotal role in the adaptive immune system by identifying and responding to antigenic proteins, both from pathogens such as viruses, bacteria and cancer cells (Joglekar & Li, 2021) as well as in the context of autoimmunity. The final and arguably most critical component of T cell response is binding between the peptide Major Histocompatibility Complex (pMHC) which contains an antigenic peptide bound to a MHC molecule and the surface receptor on T cells (TCR). The specificity of this interaction underpins T cell-mediated immunity and is an intense area of research in both the development of therapies and fundamental understanding of immune response. Understanding T cell response is the key to vaccines that confer long-lasting immunity, and can also enable effective personalized cancer therapies (Rojas et al., 2023; Poorebrahim et al., 2021). Transformer models have recently been use to analyze T cell immunity (Hudson et al., 2023; Li et al., 2023; Karthikeyan et al., 2023; Driessen et al., 2024; Cornwall et al., 2023).
- North America > United States (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Overview (0.67)
Appendix for " Self-Interpretable Model with Transformation Equivariant Interpretation "
Here we only present correctly classified examples. The first rows show the untransformed and transformed images, and the second rows the corresponding interpretations. To illustrate the comparison results consistently, we use heatmaps with the same settings to visualize the interpretations of all methods.
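A minimal sketch of what "heatmaps with the same settings" can mean in practice, assuming matplotlib and a color scale shared across all methods; the function and its arguments are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_interpretations(maps, titles):
    """Render all interpretation maps with an identical colormap and
    color scale, so the methods can be compared fairly side by side.
    `maps` is a list of 2-D saliency arrays, one per method."""
    vmin = min(m.min() for m in maps)   # shared color limits across methods
    vmax = max(m.max() for m in maps)
    fig, axes = plt.subplots(1, len(maps), figsize=(3 * len(maps), 3))
    for ax, m, title in zip(np.atleast_1d(axes), maps, titles):
        ax.imshow(m, cmap="jet", vmin=vmin, vmax=vmax)
        ax.set_title(title)
        ax.axis("off")
    plt.show()
```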
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
My Answer Is NOT 'Fair': Mitigating Social Bias in Vision-Language Models via Fair and Biased Residuals
Lan, Jian, Fu, Yifei, Schlegel, Udo, Zhang, Gengyuan, Hannan, Tanveer, Chen, Haokun, Seidl, Thomas
Social bias is a critical issue in large vision-language models (VLMs), where fairness- and ethics-related problems harm certain groups of people in society. It is unknown to what extent VLMs yield social bias in generative responses. In this study, we focus on evaluating and mitigating social bias in both the model's responses and its probability distribution. To do so, we first evaluate four state-of-the-art VLMs on the PAIRS and SocialCounterfactuals datasets with a multiple-choice selection task. Surprisingly, we find that models generate gender-biased or race-biased responses. We also observe that models are prone to stating that their responses are fair while in fact having miscalibrated confidence levels towards particular social groups. While investigating why VLMs are unfair, we observe that their hidden layers exhibit substantial fluctuations in fairness levels. Meanwhile, residuals in each layer show mixed effects on fairness: some contribute positively, while others lead to increased bias. Based on these findings, we propose a post-hoc method for the inference stage to mitigate social bias, which is training-free and model-agnostic. We achieve this by ablating bias-associated residuals while amplifying fairness-associated residuals in the model's hidden layers during inference. We demonstrate that our post-hoc method outperforms competing training strategies, helping VLMs produce fairer responses and more reliable confidence levels.
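A minimal sketch of what such a residual-level intervention could look like at inference time, assuming a PyTorch model whose blocks are exposed as `model.layers[i].mlp`; the layer indices and scale factors are placeholders, not the paper's calibrated choices.

```python
import torch

def patch_residuals(model, fair_layers, biased_layers, boost=1.5, damp=0.0):
    """Training-free, model-agnostic intervention: rescale residual-stream
    contributions during inference. Layers identified as fairness-associated
    are amplified; bias-associated ones are ablated (scaled toward zero)."""
    handles = []

    def scale_hook(factor):
        def hook(module, inputs, output):
            return output * factor      # returning a value replaces the output
        return hook

    for i in fair_layers:
        handles.append(model.layers[i].mlp.register_forward_hook(scale_hook(boost)))
    for i in biased_layers:
        handles.append(model.layers[i].mlp.register_forward_hook(scale_hook(damp)))
    return handles                      # call h.remove() on each to restore
```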
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)
Soft-CAM: Making black box models self-explainable for high-stakes decisions
Djoumessi, Kerol, Berens, Philipp
Convolutional neural networks (CNNs) are widely used for high-stakes applications such as medicine, often surpassing human performance. However, most explanation methods rely on post-hoc attribution, approximating the decision-making process of already trained black-box models. These methods are often sensitive, unreliable, and fail to reflect true model reasoning, limiting their trustworthiness in critical applications. In this work, we introduce SoftCAM, a straightforward yet effective approach that makes standard CNN architectures inherently interpretable. By removing the global average pooling layer and replacing the fully connected classification layer with a convolution-based class evidence layer, SoftCAM preserves spatial information and produces explicit class activation maps that form the basis of the model's predictions. Evaluated on three medical datasets, SoftCAM maintains classification performance while significantly improving explanations both qualitatively and quantitatively compared to existing post-hoc methods. Our results demonstrate that CNNs can be inherently interpretable without compromising performance, advancing the development of self-explainable deep learning for high-stakes decision-making.
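The architectural change is simple enough to sketch. Below is a minimal PyTorch head in the spirit of SoftCAM; details such as the aggregation over spatial locations are assumptions, not the paper's exact design.

```python
import torch.nn as nn

class SoftCAMHead(nn.Module):
    """Drop global average pooling and replace the fully connected
    classifier with a 1x1 convolutional class-evidence layer, so the
    network produces spatial class activation maps directly."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.evidence = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features):           # features: (B, C, H, W) backbone output
        cams = self.evidence(features)     # (B, num_classes, H, W) activation maps
        logits = cams.mean(dim=(2, 3))     # aggregate spatial evidence into scores
        return logits, cams                # prediction plus its own explanation
```

Because the class activation maps are computed before any pooling, the explanation is part of the forward pass itself rather than a post-hoc approximation.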
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons
Bassan, Shahaf, Eliav, Ron, Gur, Shlomit
*Minimal sufficient reasons* represent a prevalent form of explanation - the smallest subset of input features which, when held constant at their corresponding values, ensure that the prediction remains unchanged. Previous *post-hoc* methods attempt to obtain such explanations but face two main limitations: (1) Obtaining these subsets poses a computational challenge, leading most scalable methods to converge towards suboptimal, less meaningful subsets; (2) These methods heavily rely on sampling out-of-distribution input assignments, potentially resulting in counterintuitive behaviors. To tackle these limitations, we propose in this work a self-supervised training approach, which we term *sufficient subset training* (SST). Using SST, we train models to generate concise sufficient reasons for their predictions as an integral part of their output. Our results indicate that our framework produces succinct and faithful subsets substantially more efficiently than competing post-hoc methods, while maintaining comparable predictive performance.
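A rough sketch of what a sufficient-subset-style training objective might look like, assuming a model that emits a feature mask alongside its logits; the loss names, weights, and baseline fill-in are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def sst_loss(model, x, labels, baseline, alpha=1.0, beta=0.1):
    """Train the model to output its own concise sufficient reason.
    `model(x)` is assumed to return (logits, mask) with mask in [0, 1];
    `baseline` fills in the features outside the selected subset."""
    logits, mask = model(x)
    pred = logits.argmax(dim=1)                     # model's own prediction
    x_masked = mask * x + (1 - mask) * baseline     # keep only the subset
    logits_masked, _ = model(x_masked)
    task = F.cross_entropy(logits, labels)          # predict correctly
    faithful = F.cross_entropy(logits_masked, pred) # subset alone preserves it
    concise = mask.mean()                           # prefer small subsets
    return task + alpha * faithful + beta * concise
```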
- Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
In Defence of Post-hoc Explainability
The widespread adoption of machine learning in scientific research has created a fundamental tension between model opacity and scientific understanding. Whilst some advocate for intrinsically interpretable models, we introduce Computational Interpretabilism (CI) as a philosophical framework for post-hoc interpretability in scientific AI. Drawing parallels with human expertise, where post-hoc rationalisation coexists with reliable performance, CI establishes that scientific knowledge emerges through structured model interpretation when properly bounded by empirical validation. Through mediated understanding and bounded factivity, we demonstrate how post-hoc methods achieve epistemically justified insights without requiring complete mechanical transparency, resolving tensions between model complexity and scientific comprehension.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Law (0.68)
- Health & Medicine (0.67)
Revisiting the robustness of post-hoc interpretability methods
Wei, Jiawen, Turbé, Hugues, Mengaldo, Gianmarco
Post-hoc interpretability methods play a critical role in explainable artificial intelligence (XAI), as they pinpoint portions of data that a trained deep learning model deemed important to make a decision. However, different post-hoc interpretability methods often provide different results, casting doubt on their accuracy. For this reason, several evaluation strategies have been proposed to understand the accuracy of post-hoc interpretability methods. Many of these evaluation strategies provide a coarse-grained assessment -- i.e., they evaluate how the performance of the model degrades on average by corrupting different data points across multiple samples. While these strategies are effective in selecting the post-hoc interpretability method that is most reliable on average, they fail to provide a sample-level, also referred to as fine-grained, assessment. In other words, they do not measure the robustness of post-hoc interpretability methods. We propose an approach and two new metrics to provide a fine-grained assessment of post-hoc interpretability methods. We show that a method's fine-grained robustness is generally linked to its coarse-grained performance.
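A simplified sketch of a sample-level (fine-grained) assessment, as opposed to a dataset-averaged one; the degradation measure shown is illustrative and not one of the paper's two proposed metrics.

```python
import numpy as np

def sample_level_degradation(model, x, attribution, k=10, trials=20, rng=None):
    """Per-sample robustness check: corrupt the k features an
    interpretability method ranked most important for this one sample
    and measure how much the prediction confidence drops. `model` is
    assumed to map a batch of inputs to class probabilities."""
    rng = rng or np.random.default_rng(0)
    top_k = np.argsort(attribution)[-k:]          # most important features
    base = model(x[None])[0].max()                # confidence on the clean sample
    drops = []
    for _ in range(trials):
        x_c = x.copy()
        x_c[top_k] = rng.normal(size=k)           # corrupt the selected features
        drops.append(base - model(x_c[None])[0].max())
    return np.mean(drops), np.std(drops)          # degradation and its stability
```

Reporting this per sample, rather than averaged across the dataset, is what distinguishes the fine-grained view from the coarse-grained one.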
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Singapore > Central Region > Singapore (0.04)
- Europe > Switzerland > Geneva > Geneva (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Towards Understanding Sensitive and Decisive Patterns in Explainable AI: A Case Study of Model Interpretation in Geometric Deep Learning
Zhu, Jiajun, Miao, Siqi, Ying, Rex, Li, Pan
The interpretability of machine learning models has gained increasing attention, particularly in scientific domains where high precision and accountability are crucial. This research focuses on distinguishing between two critical data patterns -- sensitive patterns (model-related) and decisive patterns (task-related) -- which are commonly used as model interpretations but often lead to confusion. Specifically, this study compares the effectiveness of two main streams of interpretation methods: post-hoc methods and self-interpretable methods, in detecting these patterns. Recently, geometric deep learning (GDL) has shown superior predictive performance in various scientific applications, creating an urgent need for principled interpretation methods. Therefore, we conduct our study using several representative GDL applications as case studies. We evaluate thirteen interpretation methods applied to three major GDL backbone models, using four scientific datasets to assess how well these methods identify sensitive and decisive patterns. Our findings indicate that post-hoc methods tend to provide interpretations better aligned with sensitive patterns, whereas certain self-interpretable methods exhibit strong and stable performance in detecting decisive patterns. Additionally, our study offers valuable insights into improving the reliability of these interpretation methods. For example, ensembling post-hoc interpretations from multiple models trained on the same task can effectively uncover the task's decisive patterns.
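The ensembling suggestion at the end lends itself to a short sketch; the per-map normalization and simple averaging below are assumptions about one reasonable instantiation.

```python
import numpy as np

def ensemble_attributions(models, explain, x):
    """Ensemble post-hoc interpretations across independently trained
    models on the same task: model-specific (sensitive) signal tends to
    average out, while task-related (decisive) patterns shared by all
    models are reinforced. `explain(model, x)` is any post-hoc method
    returning one attribution score per input feature."""
    attrs = np.stack([explain(m, x) for m in models])   # (n_models, n_features)
    # Normalize each map so no single model dominates the consensus.
    attrs /= np.abs(attrs).sum(axis=1, keepdims=True) + 1e-12
    return attrs.mean(axis=0)                           # consensus attribution
```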
- North America > United States (0.28)
- Europe > Austria (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)